Emergent: a novel data-set for stance classification
نویسندگان
چکیده
We present Emergent, a novel data-set derived from a digital journalism project for rumour debunking. The data-set contains 300 rumoured claims and 2,595 associated news articles, collected and labelled by journalists with an estimation of their veracity (true, false or unverified). Each associated article is summarized into a headline and labelled to indicate whether its stance is for, against, or observing the claim, where observing indicates that the article merely repeats the claim. Thus, Emergent provides a real-world data source for a variety of natural language processing tasks in the context of fact-checking. Further to presenting the dataset, we address the task of determining the article headline stance with respect to the claim. For this purpose we use a logistic regression classifier and develop features that examine the headline and its agreement with the claim. The accuracy achieved was 73% which is 26% higher than the one achieved by the Excitement Open Platform (Magnini et al., 2014).
منابع مشابه
Automated Classification of Stance in Student Essays: An Approach Using Stance Target Information and the Wikipedia Link-Based Measure
We present a new approach to the automated classification of document-level argument stance, a relatively underresearched sub-task of Sentiment Analysis. In place of the noisy online debate data currently used in stance classification research, a corpus of student essays annotated for essaylevel stance is constructed for use in a series of classification experiments. A novel set of features des...
متن کاملFrom Argumentation Mining to Stance Classification
Argumentation mining and stance classification were recently introduced as interesting tasks in text mining. In this paper, a novel framework for argument tagging based on topic modeling is proposed. Unlike other machine learning approaches for argument tagging which often require large set of labeled data, the proposed model is minimally supervised and merely a one-to-one mapping between the p...
متن کاملS3PSO: Students’ Performance Prediction Based on Particle Swarm Optimization
Nowadays, new methods are required to take advantage of the rich and extensive gold mine of data given the vast content of data particularly created by educational systems. Data mining algorithms have been used in educational systems especially e-learning systems due to the broad usage of these systems. Providing a model to predict final student results in educational course is a reason for usi...
متن کاملSimple Open Stance Classification for Rumour Analysis
Stance classification determines the attitude, or stance, in a (typically short) text. The task has powerful applications, such as the detection of fake news or the automatic extraction of attitudes toward entities or events in the media. This paper describes a surprisingly simple and efficient classification approach to open stance classification in Twitter, for rumour and veracity classificat...
متن کاملFDiBC: A Novel Fraud Detection Method in Bank Club based on Sliding Time and Scores Window
One of the recent strategies for increasing the customer’s loyalty in banking industry is the use of customers’ club system. In this system, customers receive scores on the basis of financial and club activities they are performing, and due to the achieved points, they get credits from the bank. In addition, by the advent of new technologies, fraud is growing in banking domain as well. Therefor...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016